Overview

Dataset statistics

Number of variables: 20
Number of observations: 18723
Missing cells: 74944
Missing cells (%): 20.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 MiB
Average record size in memory: 136.0 B
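The missing-cell totals can be cross-checked against the per-variable counts reported in this profile: country, borough, bathrooms and minstay are missing in all 18723 rows, name is missing in 52 rows, and every other variable reports 0 missing values. A minimal sketch of that arithmetic (the variable names in the code are chosen for illustration only):

```python
# Cross-check of "Missing cells" using counts reported in this profile.
n_rows = 18723
n_cols = 20

# Per-variable missing counts as listed in the report; all other
# variables are reported with 0 missing values.
missing_by_variable = {
    "country": 18723,
    "borough": 18723,
    "bathrooms": 18723,
    "minstay": 18723,
    "name": 52,
}

missing_cells = sum(missing_by_variable.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)

print(missing_cells)          # 74944
print(f"{missing_pct:.1f}%")  # 20.0%
```

Almost all of the 20.0% comes from the four empty columns, so dropping them would leave a dataset that is close to complete.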

Variable types

Numeric: 9
Categorical: 7
Unsupported: 4

Warnings

survey_id has constant value "1476" Constant
city has constant value "Amsterdam" Constant
name has a high cardinality: 18150 distinct values High cardinality
last_modified has a high cardinality: 18723 distinct values High cardinality
location has a high cardinality: 18723 distinct values High cardinality
room_id is highly correlated with host_id High correlation
host_id is highly correlated with room_id High correlation
accommodates is highly correlated with bedrooms and 1 other field High correlation
bedrooms is highly correlated with accommodates High correlation
price is highly correlated with accommodates High correlation
room_id is highly correlated with reviews High correlation
reviews is highly correlated with room_id and 1 other field High correlation
overall_satisfaction is highly correlated with reviews High correlation
accommodates is highly correlated with bedrooms and 1 other field High correlation
bedrooms is highly correlated with accommodates and 1 other field High correlation
price is highly correlated with accommodates and 1 other field High correlation
reviews is highly correlated with overall_satisfaction High correlation
overall_satisfaction is highly correlated with reviews High correlation
accommodates is highly correlated with bedrooms High correlation
bedrooms is highly correlated with accommodates High correlation
accommodates is highly correlated with bedrooms High correlation
bedrooms is highly correlated with accommodates High correlation
neighborhood is highly correlated with latitude and 1 other field High correlation
host_id is highly correlated with room_id High correlation
room_id is highly correlated with host_id High correlation
latitude is highly correlated with neighborhood and 1 other field High correlation
longitude is highly correlated with neighborhood and 1 other field High correlation
neighborhood is highly correlated with survey_id and 1 other field High correlation
room_type is highly correlated with survey_id and 1 other field High correlation
survey_id is highly correlated with neighborhood and 2 other fields High correlation
city is highly correlated with neighborhood and 2 other fields High correlation
country has 18723 (100.0%) missing values Missing
borough has 18723 (100.0%) missing values Missing
bathrooms has 18723 (100.0%) missing values Missing
minstay has 18723 (100.0%) missing values Missing
name is uniformly distributed Uniform
last_modified is uniformly distributed Uniform
location is uniformly distributed Uniform
room_id has unique values Unique
last_modified has unique values Unique
location has unique values Unique
country is an unsupported type, check if it needs cleaning or further analysis Unsupported
borough is an unsupported type, check if it needs cleaning or further analysis Unsupported
bathrooms is an unsupported type, check if it needs cleaning or further analysis Unsupported
minstay is an unsupported type, check if it needs cleaning or further analysis Unsupported
reviews has 2984 (15.9%) zeros Zeros
overall_satisfaction has 5748 (30.7%) zeros Zeros
bedrooms has 1154 (6.2%) zeros Zeros

Reproduction

Analysis started: 2021-09-07 09:41:18.895388
Analysis finished: 2021-09-07 09:41:43.221514
Duration: 24.33 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
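The reported duration follows directly from the two timestamps. A small sketch using only the Python standard library, with the timestamps copied from this section:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2021-09-07 09:41:18.895388")
finished = datetime.fromisoformat("2021-09-07 09:41:43.221514")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 24.33 seconds
```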

Variables

room_id
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIQUE

Distinct: 18723
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11205678.03
Minimum: 2818
Maximum: 20003728
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 146.3 KiB

Quantile statistics

Minimum: 2818
5-th percentile: 1018799.4
Q1: 6050607.5
Median: 12282874
Q3: 16610843
95-th percentile: 19578785
Maximum: 20003728
Range: 20000910
Interquartile range (IQR): 10560235.5

Descriptive statistics

Standard deviation: 6082192.263
Coefficient of variation (CV): 0.5427777102
Kurtosis: -1.22613161
Mean: 11205678.03
Median Absolute Deviation (MAD): 5236951
Skewness: -0.2542978146
Sum: 2.098039098 × 10^11
Variance: 3.699306273 × 10^13
Monotonicity: Not monotonic
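Several of these figures are derived from others in the same block: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A short sketch that recomputes them from the room_id values reported above, using the standard textbook definitions (an assumption about how the report computes them):

```python
# Values copied from the room_id section of this report.
minimum, maximum = 2818, 20003728
q1, q3 = 6050607.5, 16610843
mean, std = 11205678.03, 6082192.263

value_range = maximum - minimum  # reported as Range
iqr = q3 - q1                    # reported as Interquartile range (IQR)
cv = std / mean                  # reported as Coefficient of variation (CV)
variance = std ** 2              # reported as Variance

print(value_range)          # 20000910
print(iqr)                  # 10560235.5
print(round(cv, 4))         # 0.5428
print(f"{variance:.3e}")    # 3.699e+13
```

The same relations hold for the other numeric variables in this report, so they can serve as a quick consistency check after the data is reloaded.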
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
19075070              1       < 0.1%
13501077              1       < 0.1%
16253557              1       < 0.1%
2953104               1       < 0.1%
3865210               1       < 0.1%
11597089              1       < 0.1%
1333886               1       < 0.1%
13806208              1       < 0.1%
3521399               1       < 0.1%
11614850              1       < 0.1%
Other values (18713)  18713   99.9%

Value                 Count   Frequency (%)
2818                  1       < 0.1%
3209                  1       < 0.1%
20168                 1       < 0.1%
25428                 1       < 0.1%
25488                 1       < 0.1%
27886                 1       < 0.1%
28658                 1       < 0.1%
28871                 1       < 0.1%
29051                 1       < 0.1%
29554                 1       < 0.1%

Value                 Count   Frequency (%)
20003728              1       < 0.1%
19996091              1       < 0.1%
19995673              1       < 0.1%
19995327              1       < 0.1%
19995246              1       < 0.1%
19995106              1       < 0.1%
19994262              1       < 0.1%
19992677              1       < 0.1%
19992596              1       < 0.1%
19992241              1       < 0.1%

survey_id
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 73.2 KiB
1476: 18723

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 74892
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1476
2nd row: 1476
3rd row: 1476
4th row: 1476
5th row: 1476

Common Values

Value                 Count   Frequency (%)
1476                  18723   100.0%

Length

Histogram of lengths of the category

Pie chart

Value                 Count   Frequency (%)
1476                  18723   100.0%

Most occurring characters

Value                 Count   Frequency (%)
1                     18723   25.0%
4                     18723   25.0%
7                     18723   25.0%
6                     18723   25.0%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        74892   100.0%

Most frequent character per category

Decimal Number
Value                 Count   Frequency (%)
1                     18723   25.0%
4                     18723   25.0%
7                     18723   25.0%
6                     18723   25.0%

Most occurring scripts

Value                 Count   Frequency (%)
Common                74892   100.0%

Most frequent character per script

Common
Value                 Count   Frequency (%)
1                     18723   25.0%
4                     18723   25.0%
7                     18723   25.0%
6                     18723   25.0%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 74892   100.0%

Most frequent character per block

ASCII
Value                 Count   Frequency (%)
1                     18723   25.0%
4                     18723   25.0%
7                     18723   25.0%
6                     18723   25.0%

host_id
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 15943
Distinct (%): 85.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35776116.68
Minimum: 2234
Maximum: 141831915
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 146.3 KiB

Quantile statistics

Minimum: 2234
5-th percentile: 1477396.3
Q1: 7140879
Median: 19886414
Q3: 52026801
95-th percentile: 121891620.7
Maximum: 141831915
Range: 141829681
Interquartile range (IQR): 44885922

Descriptive statistics

Standard deviation: 37581025.87
Coefficient of variation (CV): 1.050450114
Kurtosis: 0.4901090157
Mean: 35776116.68
Median Absolute Deviation (MAD): 15783470
Skewness: 1.243881153
Sum: 6.698362327 × 10^11
Variance: 1.412333505 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
48703385              93      0.5%
113977564             88      0.5%
1464510               71      0.4%
107745142             64      0.3%
84453740              61      0.3%
65859990              54      0.3%
517215                52      0.3%
46691672              43      0.2%
84449589              37      0.2%
669178                36      0.2%
Other values (15933)  18124   96.8%

Value                 Count   Frequency (%)
2234                  1       < 0.1%
3159                  1       < 0.1%
3806                  1       < 0.1%
5988                  2       < 0.1%
7924                  1       < 0.1%
12085                 1       < 0.1%
20405                 1       < 0.1%
34080                 1       < 0.1%
36701                 1       < 0.1%
40786                 1       < 0.1%

Value                 Count   Frequency (%)
141831915             1       < 0.1%
141749109             1       < 0.1%
141747815             1       < 0.1%
141665148             4       < 0.1%
141658022             1       < 0.1%
141648682             1       < 0.1%
141551211             1       < 0.1%
141548705             1       < 0.1%
141542351             1       < 0.1%
141534602             1       < 0.1%

room_type
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 73.2 KiB
Entire home/apt: 14978
Private room: 3682
Shared room: 63

Length

Max length: 15
Median length: 15
Mean length: 14.39657106
Min length: 11
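The length statistics for room_type can be reproduced from the three category counts, since each row contributes the length of its label. A minimal sketch, assuming length is counted in characters including the space and the slash:

```python
# Category counts as reported for room_type in this profile.
counts = {
    "Entire home/apt": 14978,
    "Private room": 3682,
    "Shared room": 63,
}

n_rows = sum(counts.values())
total_chars = sum(len(label) * n for label, n in counts.items())
mean_length = total_chars / n_rows

print(n_rows)                 # 18723
print(total_chars)            # 269547
print(round(mean_length, 4))  # 14.3966
```

The total matches the "Total characters" figure reported for this variable, and the mean matches the reported mean length.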

Characters and Unicode

Total characters: 269547
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Shared room
2nd row: Shared room
3rd row: Shared room
4th row: Shared room
5th row: Shared room

Common Values

Value                 Count   Frequency (%)
Entire home/apt       14978   80.0%
Private room          3682    19.7%
Shared room           63      0.3%

Length

Histogram of lengths of the category

Pie chart

Value                 Count   Frequency (%)
home/apt              14978   40.0%
entire                14978   40.0%
room                  3745    10.0%
private               3682    9.8%
shared                63      0.2%

Most occurring characters

ValueCountFrequency (%)
e33701
12.5%
t33638
12.5%
r22468
8.3%
o22468
8.3%
a18723
 
6.9%
18723
 
6.9%
m18723
 
6.9%
i18660
 
6.9%
h15041
 
5.6%
E14978
 
5.6%
Other values (7)52424
19.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter217123
80.6%
Uppercase Letter18723
 
6.9%
Space Separator18723
 
6.9%
Other Punctuation14978
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e33701
15.5%
t33638
15.5%
r22468
10.3%
o22468
10.3%
a18723
8.6%
m18723
8.6%
i18660
8.6%
h15041
6.9%
n14978
6.9%
p14978
6.9%
Other values (2)3745
 
1.7%
Uppercase Letter
ValueCountFrequency (%)
E14978
80.0%
P3682
 
19.7%
S63
 
0.3%
Space Separator
ValueCountFrequency (%)
18723
100.0%
Other Punctuation
ValueCountFrequency (%)
/14978
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin235846
87.5%
Common33701
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e33701
14.3%
t33638
14.3%
r22468
9.5%
o22468
9.5%
a18723
7.9%
m18723
7.9%
i18660
7.9%
h15041
6.4%
E14978
6.4%
n14978
6.4%
Other values (5)22468
9.5%
Common
ValueCountFrequency (%)
18723
55.6%
/14978
44.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII269547
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e33701
12.5%
t33638
12.5%
r22468
8.3%
o22468
8.3%
a18723
 
6.9%
18723
 
6.9%
m18723
 
6.9%
i18660
 
6.9%
h15041
 
5.6%
E14978
 
5.6%
Other values (7)52424
19.4%

country
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 18723
Missing (%): 100.0%
Memory size: 146.3 KiB

city
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 73.2 KiB
Amsterdam: 18723

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 168507
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Amsterdam
2nd row: Amsterdam
3rd row: Amsterdam
4th row: Amsterdam
5th row: Amsterdam

Common Values

Value                 Count   Frequency (%)
Amsterdam             18723   100.0%

Length

Histogram of lengths of the category

Pie chart

Value                 Count   Frequency (%)
amsterdam             18723   100.0%

Most occurring characters

ValueCountFrequency (%)
m37446
22.2%
A18723
11.1%
s18723
11.1%
t18723
11.1%
e18723
11.1%
r18723
11.1%
d18723
11.1%
a18723
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter149784
88.9%
Uppercase Letter18723
 
11.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m37446
25.0%
s18723
12.5%
t18723
12.5%
e18723
12.5%
r18723
12.5%
d18723
12.5%
a18723
12.5%
Uppercase Letter
ValueCountFrequency (%)
A18723
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin168507
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
m37446
22.2%
A18723
11.1%
s18723
11.1%
t18723
11.1%
e18723
11.1%
r18723
11.1%
d18723
11.1%
a18723
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII168507
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
m37446
22.2%
A18723
11.1%
s18723
11.1%
t18723
11.1%
e18723
11.1%
r18723
11.1%
d18723
11.1%
a18723
11.1%

borough
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 18723
Missing (%): 100.0%
Memory size: 146.3 KiB

neighborhood
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 23
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 73.2 KiB
De Baarsjes / Oud West: 3289
De Pijp / Rivierenbuurt: 2378
Centrum West: 2225
Centrum Oost: 1730
Westerpark: 1430
Other values (18): 7671

Length

Max length: 38
Median length: 15
Mean length: 17.51572932
Min length: 6

Characters and Unicode

Total characters: 327947
Distinct characters: 43
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: De Pijp / Rivierenbuurt
2nd row: Centrum West
3rd row: Watergraafsmeer
4th row: Centrum West
5th row: De Baarsjes / Oud West

Common Values

Value                                     Count   Frequency (%)
De Baarsjes / Oud West                    3289    17.6%
De Pijp / Rivierenbuurt                   2378    12.7%
Centrum West                              2225    11.9%
Centrum Oost                              1730    9.2%
Westerpark                                1430    7.6%
Noord-West / Noord-Midden                 1418    7.6%
Oud Oost                                  1169    6.2%
Bos en Lommer                             988     5.3%
Oostelijk Havengebied / Indische Buurt    921     4.9%
Watergraafsmeer                           517     2.8%
Other values (13)                         2658    14.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
8985
15.9%
de5781
10.3%
west5755
10.2%
oud4952
 
8.8%
centrum4054
 
7.2%
baarsjes3289
 
5.8%
oost3217
 
5.7%
rivierenbuurt2378
 
4.2%
pijp2378
 
4.2%
westerpark1430
 
2.5%
Other values (27)14130
25.1%

Most occurring characters

ValueCountFrequency (%)
e41125
 
12.5%
37626
 
11.5%
r24877
 
7.6%
s22215
 
6.8%
t22148
 
6.8%
u17169
 
5.2%
d14742
 
4.5%
o14591
 
4.4%
i12545
 
3.8%
a11932
 
3.6%
Other values (33)108977
33.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter229288
69.9%
Uppercase Letter49212
 
15.0%
Space Separator37626
 
11.5%
Other Punctuation8985
 
2.7%
Dash Punctuation2836
 
0.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e41125
17.9%
r24877
10.8%
s22215
9.7%
t22148
9.7%
u17169
7.5%
d14742
 
6.4%
o14591
 
6.4%
i12545
 
5.5%
a11932
 
5.2%
n11659
 
5.1%
Other values (13)36285
15.8%
Uppercase Letter
ValueCountFrequency (%)
O9253
18.8%
W9135
18.6%
D5823
11.8%
B5644
11.5%
C4054
8.2%
N3906
7.9%
P2378
 
4.8%
R2378
 
4.8%
M1418
 
2.9%
I1299
 
2.6%
Other values (7)3924
8.0%
Space Separator
ValueCountFrequency (%)
37626
100.0%
Other Punctuation
ValueCountFrequency (%)
/8985
100.0%
Dash Punctuation
ValueCountFrequency (%)
-2836
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin278500
84.9%
Common49447
 
15.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e41125
14.8%
r24877
 
8.9%
s22215
 
8.0%
t22148
 
8.0%
u17169
 
6.2%
d14742
 
5.3%
o14591
 
5.2%
i12545
 
4.5%
a11932
 
4.3%
n11659
 
4.2%
Other values (30)85497
30.7%
Common
ValueCountFrequency (%)
37626
76.1%
/8985
 
18.2%
-2836
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII327947
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e41125
 
12.5%
37626
 
11.5%
r24877
 
7.6%
s22215
 
6.8%
t22148
 
6.8%
u17169
 
5.2%
d14742
 
4.5%
o14591
 
4.4%
i12545
 
3.8%
a11932
 
3.6%
Other values (33)108977
33.2%

reviews
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 284
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.74154783
Minimum: 0
Maximum: 532
Zeros: 2984
Zeros (%): 15.9%
Negative: 0
Negative (%): 0.0%
Memory size: 146.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 6
Q3: 17
95-th percentile: 67
Maximum: 532
Range: 532
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 33.52262982
Coefficient of variation (CV): 2.00236144
Kurtosis: 43.75643505
Mean: 16.74154783
Median Absolute Deviation (MAD): 6
Skewness: 5.502786616
Sum: 313452
Variance: 1123.76671
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0                     2984    15.9%
1                     1510    8.1%
2                     1246    6.7%
3                     1103    5.9%
4                     925     4.9%
5                     876     4.7%
6                     741     4.0%
7                     683     3.6%
8                     590     3.2%
9                     529     2.8%
Other values (274)    7536    40.2%

Value                 Count   Frequency (%)
0                     2984    15.9%
1                     1510    8.1%
2                     1246    6.7%
3                     1103    5.9%
4                     925     4.9%
5                     876     4.7%
6                     741     4.0%
7                     683     3.6%
8                     590     3.2%
9                     529     2.8%

Value                 Count   Frequency (%)
532                   1       < 0.1%
465                   1       < 0.1%
463                   1       < 0.1%
452                   1       < 0.1%
447                   1       < 0.1%
443                   2       < 0.1%
433                   1       < 0.1%
430                   2       < 0.1%
425                   1       < 0.1%
410                   2       < 0.1%

overall_satisfaction
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.301126956
Minimum: 0
Maximum: 5
Zeros: 5748
Zeros (%): 30.7%
Negative: 0
Negative (%): 0.0%
Memory size: 146.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 4.5
Q3: 5
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.213557519
Coefficient of variation (CV): 0.6705460129
Kurtosis: -1.317124352
Mean: 3.301126956
Median Absolute Deviation (MAD): 0.5
Skewness: -0.7927016145
Sum: 61807
Variance: 4.899836888
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value                 Count   Frequency (%)
5                     7708    41.2%
0                     5748    30.7%
4.5                   4559    24.3%
4                     577     3.1%
3.5                   109     0.6%
3                     19      0.1%
1                     1       < 0.1%
2.5                   1       < 0.1%
1.5                   1       < 0.1%

Value                 Count   Frequency (%)
0                     5748    30.7%
1                     1       < 0.1%
1.5                   1       < 0.1%
2.5                   1       < 0.1%
3                     19      0.1%
3.5                   109     0.6%
4                     577     3.1%
4.5                   4559    24.3%
5                     7708    41.2%

Value                 Count   Frequency (%)
5                     7708    41.2%
4.5                   4559    24.3%
4                     577     3.1%
3.5                   109     0.6%
3                     19      0.1%
2.5                   1       < 0.1%
1.5                   1       < 0.1%
1                     1       < 0.1%
0                     5748    30.7%
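Nearly a third of the overall_satisfaction values are 0, which pulls the mean down to about 3.3 even though almost every non-zero value is 4 or higher. The sketch below recomputes the mean from the frequency table and then again without the zeros. Treating 0 as "no rating" is an assumption made for illustration; the report does not say what 0 means.

```python
# Frequency table for overall_satisfaction as reported in this profile.
rating_counts = {
    5: 7708, 0: 5748, 4.5: 4559, 4: 577, 3.5: 109,
    3: 19, 1: 1, 2.5: 1, 1.5: 1,
}

n_rows = sum(rating_counts.values())
total = sum(rating * n for rating, n in rating_counts.items())
mean_all = total / n_rows

# Assumption: 0 stands for "no rating" rather than a real score of zero.
rated = {r: n for r, n in rating_counts.items() if r != 0}
n_rated = sum(rated.values())
mean_rated = sum(r * n for r, n in rated.items()) / n_rated

print(n_rows, total)         # 18723 61807.0
print(round(mean_all, 4))    # 3.3011
print(n_rated)               # 12975
print(round(mean_rated, 4))  # 4.7635
```

If the assumption holds, replacing 0 with a missing-value marker before analysis would give summary statistics that describe actual ratings.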

accommodates
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 16
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.922021044
Minimum: 1
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 146.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2
Median: 2
Q3: 4
95-th percentile: 5
Maximum: 17
Range: 16
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.327523924
Coefficient of variation (CV): 0.4543170307
Kurtosis: 14.34067519
Mean: 2.922021044
Median Absolute Deviation (MAD): 0
Skewness: 2.388797947
Sum: 54709
Variance: 1.762319769
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)
Value                 Count   Frequency (%)
2                     10024   53.5%
4                     5579    29.8%
3                     1585    8.5%
6                     476     2.5%
5                     471     2.5%
1                     367     2.0%
8                     105     0.6%
7                     52      0.3%
16                    20      0.1%
10                    16      0.1%
Other values (6)      28      0.1%

Value                 Count   Frequency (%)
1                     367     2.0%
2                     10024   53.5%
3                     1585    8.5%
4                     5579    29.8%
5                     471     2.5%
6                     476     2.5%
7                     52      0.3%
8                     105     0.6%
9                     8       < 0.1%
10                    16      0.1%

Value                 Count   Frequency (%)
17                    1       < 0.1%
16                    20      0.1%
14                    6       < 0.1%
13                    1       < 0.1%
12                    10      0.1%
11                    2       < 0.1%
10                    16      0.1%
9                     8       < 0.1%
8                     105     0.6%
7                     52      0.3%

bedrooms
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.430379747
Minimum: 0
Maximum: 10
Zeros: 1154
Zeros (%): 6.2%
Negative: 0
Negative (%): 0.0%
Memory size: 146.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 1
Q3: 2
95-th percentile: 3
Maximum: 10
Range: 10
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8790186913
Coefficient of variation (CV): 0.6145351913
Kurtosis: 5.625756602
Mean: 1.430379747
Median Absolute Deviation (MAD): 0
Skewness: 1.601304105
Sum: 26781
Variance: 0.7726738596
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value                 Count   Frequency (%)
1                     11101   59.3%
2                     4456    23.8%
3                     1444    7.7%
0                     1154    6.2%
4                     473     2.5%
5                     62      0.3%
6                     19      0.1%
10                    5       < 0.1%
7                     4       < 0.1%
8                     3       < 0.1%

Value                 Count   Frequency (%)
0                     1154    6.2%
1                     11101   59.3%
2                     4456    23.8%
3                     1444    7.7%
4                     473     2.5%
5                     62      0.3%
6                     19      0.1%
7                     4       < 0.1%
8                     3       < 0.1%
9                     2       < 0.1%

Value                 Count   Frequency (%)
10                    5       < 0.1%
9                     2       < 0.1%
8                     3       < 0.1%
7                     4       < 0.1%
6                     19      0.1%
5                     62      0.3%
4                     473     2.5%
3                     1444    7.7%
2                     4456    23.8%
1                     11101   59.3%

bathrooms
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 18723
Missing (%): 100.0%
Memory size: 146.3 KiB

price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 423
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 166.5994766
Minimum: 12
Maximum: 6000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 146.3 KiB

Quantile statistics

Minimum: 12
5-th percentile: 72
Q1: 108
Median: 144
Q3: 192
95-th percentile: 330
Maximum: 6000
Range: 5988
Interquartile range (IQR): 84

Descriptive statistics

Standard deviation: 108.9438487
Coefficient of variation (CV): 0.6539267168
Kurtosis: 521.8652555
Mean: 166.5994766
Median Absolute Deviation (MAD): 36
Skewness: 12.76898743
Sum: 3119242
Variance: 11868.76218
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
119                   1023    5.5%
180                   1001    5.3%
144                   887     4.7%
150                   621     3.3%
132                   588     3.1%
108                   562     3.0%
96                    520     2.8%
114                   509     2.7%
118                   508     2.7%
240                   495     2.6%
Other values (413)    12009   64.1%

Value                 Count   Frequency (%)
12                    1       < 0.1%
18                    1       < 0.1%
21                    1       < 0.1%
22                    1       < 0.1%
23                    1       < 0.1%
24                    6       < 0.1%
25                    1       < 0.1%
28                    1       < 0.1%
29                    2       < 0.1%
30                    6       < 0.1%

Value                 Count   Frequency (%)
6000                  1       < 0.1%
3770                  1       < 0.1%
1920                  1       < 0.1%
1799                  1       < 0.1%
1558                  1       < 0.1%
1428                  1       < 0.1%
1412                  1       < 0.1%
1386                  1       < 0.1%
1343                  1       < 0.1%
1319                  1       < 0.1%

minstay
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 18723
Missing (%): 100.0%
Memory size: 146.3 KiB

name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 18150
Distinct (%): 97.2%
Missing: 52
Missing (%): 0.3%
Memory size: 73.2 KiB
Amsterdam: 36
Lovely apartment near Vondelpark: 10
Spacious family house with garden: 8
Cosy apartment in Amsterdam: 8
Beautiful apartment in Amsterdam: 8
Other values (18145): 18601

Length

Max length: 78
Median length: 35
Mean length: 36.09233571
Min length: 1

Characters and Unicode

Total characters: 673880
Distinct characters: 157
Distinct categories: 20
Distinct scripts: 4
Distinct blocks: 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 17814
Unique (%): 95.4%

Sample

1st row: Red Light/ Canal view apartment (Shared)
2nd row: Sunny and Cozy Living room in quite neighbours
3rd row: Amsterdam
4th row: Canal boat RIDE in Amsterdam
5th row: One room for rent in a three room appartment

Common Values

Value                                     Count   Frequency (%)
Amsterdam                                 36      0.2%
Lovely apartment near Vondelpark          10      0.1%
Spacious family house with garden         8       < 0.1%
Cosy apartment in Amsterdam               8       < 0.1%
Beautiful apartment in Amsterdam          8       < 0.1%
Magnificent panoramic city view           8       < 0.1%
Lovely apartment in Amsterdam             7       < 0.1%
Spacious apartment near Vondelpark        7       < 0.1%
Nice comfy room, magnificent view         7       < 0.1%
Apartment in Amsterdam                    6       < 0.1%
Other values (18140)                      18566   99.2%
(Missing)                                 52      0.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
apartment7118
 
6.7%
in5730
 
5.4%
amsterdam3588
 
3.4%
3195
 
3.0%
with2669
 
2.5%
the2165
 
2.0%
spacious2082
 
2.0%
city1906
 
1.8%
centre1768
 
1.7%
room1728
 
1.6%
Other values (4867)73723
69.8%

Most occurring characters

ValueCountFrequency (%)
87491
 
13.0%
e59230
 
8.8%
t55217
 
8.2%
a52626
 
7.8%
r42831
 
6.4%
n39759
 
5.9%
o35472
 
5.3%
i32482
 
4.8%
m26379
 
3.9%
s21195
 
3.1%
Other values (147)221198
32.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter510398
75.7%
Space Separator87492
 
13.0%
Uppercase Letter54936
 
8.2%
Other Punctuation11184
 
1.7%
Decimal Number5572
 
0.8%
Dash Punctuation1595
 
0.2%
Math Symbol1136
 
0.2%
Close Punctuation621
 
0.1%
Open Punctuation588
 
0.1%
Other Symbol236
 
< 0.1%
Other values (10)122
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (22)22
56.4%
Lowercase Letter
ValueCountFrequency (%)
e59230
11.6%
t55217
10.8%
a52626
10.3%
r42831
 
8.4%
n39759
 
7.8%
o35472
 
6.9%
i32482
 
6.4%
m26379
 
5.2%
s21195
 
4.2%
p19825
 
3.9%
Other values (20)125382
24.6%
Uppercase Letter
ValueCountFrequency (%)
A8892
16.2%
C6863
12.5%
S4399
 
8.0%
L3283
 
6.0%
B3251
 
5.9%
R2791
 
5.1%
P2694
 
4.9%
E2341
 
4.3%
T2219
 
4.0%
N2194
 
4.0%
Other values (17)16009
29.1%
Other Punctuation
ValueCountFrequency (%)
,2817
25.2%
!2756
24.6%
&1686
15.1%
.1473
13.2%
'831
 
7.4%
/587
 
5.2%
@315
 
2.8%
"285
 
2.5%
:189
 
1.7%
*154
 
1.4%
Other values (7)91
 
0.8%
Decimal Number
ValueCountFrequency (%)
21885
33.8%
1992
17.8%
0741
 
13.3%
5498
 
8.9%
3463
 
8.3%
4412
 
7.4%
8150
 
2.7%
6150
 
2.7%
9145
 
2.6%
7136
 
2.4%
Other Symbol
ValueCountFrequency (%)
171
72.5%
33
 
14.0%
14
 
5.9%
5
 
2.1%
5
 
2.1%
3
 
1.3%
°3
 
1.3%
1
 
0.4%
1
 
0.4%
Math Symbol
ValueCountFrequency (%)
+660
58.1%
|460
40.5%
<5
 
0.4%
>4
 
0.4%
=3
 
0.3%
~2
 
0.2%
1
 
0.1%
÷1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
(581
98.8%
[6
 
1.0%
1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
)614
98.9%
]6
 
1.0%
1
 
0.2%
Space Separator
ValueCountFrequency (%)
87491
> 99.9%
 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-1593
99.9%
2
 
0.1%
Nonspacing Mark
ValueCountFrequency (%)
15
93.8%
1
 
6.2%
Control
ValueCountFrequency (%)
6
50.0%
6
50.0%
Final Punctuation
ValueCountFrequency (%)
9
81.8%
2
 
18.2%
Initial Punctuation
ValueCountFrequency (%)
3
60.0%
2
40.0%
Currency Symbol
ValueCountFrequency (%)
4
80.0%
$1
 
20.0%
Other Number
ValueCountFrequency (%)
²22
100.0%
Connector Punctuation
ValueCountFrequency (%)
_7
100.0%
Modifier Symbol
ValueCountFrequency (%)
´4
100.0%
Format
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin565334
83.9%
Common108491
 
16.1%
Han39
 
< 0.1%
Inherited16
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
87491
80.6%
,2817
 
2.6%
!2756
 
2.5%
21885
 
1.7%
&1686
 
1.6%
-1593
 
1.5%
.1473
 
1.4%
1992
 
0.9%
'831
 
0.8%
0741
 
0.7%
Other values (56)6226
 
5.7%
Latin
ValueCountFrequency (%)
e59230
 
10.5%
t55217
 
9.8%
a52626
 
9.3%
r42831
 
7.6%
n39759
 
7.0%
o35472
 
6.3%
i32482
 
5.7%
m26379
 
4.7%
s21195
 
3.7%
p19825
 
3.5%
Other values (47)180318
31.9%
Han
ValueCountFrequency (%)
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (22)22
56.4%
Inherited
ValueCountFrequency (%)
15
93.8%
1
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII673492
99.9%
Misc Symbols216
 
< 0.1%
Latin 1 Sup57
 
< 0.1%
CJK39
 
< 0.1%
Punctuation34
 
< 0.1%
VS16
 
< 0.1%
Dingbats14
 
< 0.1%
None7
 
< 0.1%
Currency Symbols4
 
< 0.1%
Math Operators1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
87491
 
13.0%
e59230
 
8.8%
t55217
 
8.2%
a52626
 
7.8%
r42831
 
6.4%
n39759
 
5.9%
o35472
 
5.3%
i32482
 
4.8%
m26379
 
3.9%
s21195
 
3.1%
Other values (82)220810
32.8%
Latin 1 Sup
ValueCountFrequency (%)
²22
38.6%
é15
26.3%
à4
 
7.0%
´4
 
7.0%
°3
 
5.3%
É3
 
5.3%
á2
 
3.5%
¡1
 
1.8%
 1
 
1.8%
÷1
 
1.8%
Misc Symbols
ValueCountFrequency (%)
171
79.2%
33
 
15.3%
5
 
2.3%
5
 
2.3%
1
 
0.5%
1
 
0.5%
None
ValueCountFrequency (%)
3
42.9%
2
28.6%
1
 
14.3%
1
 
14.3%
VS
ValueCountFrequency (%)
15
93.8%
1
 
6.2%
Dingbats
ValueCountFrequency (%)
14
100.0%
Punctuation
ValueCountFrequency (%)
15
44.1%
9
26.5%
3
 
8.8%
2
 
5.9%
2
 
5.9%
2
 
5.9%
1
 
2.9%
Math Operators
ValueCountFrequency (%)
1
100.0%
Currency Symbols
ValueCountFrequency (%)
4
100.0%
CJK
ValueCountFrequency (%)
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (22)22
56.4%

last_modified
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 18723
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 73.2 KiB
2017-07-22 16:33:46.800407: 1
2017-07-23 02:50:22.347298: 1
2017-07-23 02:58:55.274841: 1
2017-07-22 16:18:32.872076: 1
2017-07-22 16:35:28.316834: 1
Other values (18718): 18718

Length

Max length: 26
Median length: 26
Mean length: 26
Min length: 26

Characters and Unicode

Total characters: 486798
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18723
Unique (%): 100.0%

Sample

1st row: 2017-07-23 13:06:27.391699
2nd row: 2017-07-23 13:06:23.607187
3rd row: 2017-07-23 13:06:23.603546
4th row: 2017-07-23 13:06:22.689787
5th row: 2017-07-23 13:06:19.681469

Common Values

Value                         Count   Frequency (%)
2017-07-22 16:33:46.800407    1       < 0.1%
2017-07-23 02:50:22.347298    1       < 0.1%
2017-07-23 02:58:55.274841    1       < 0.1%
2017-07-22 16:18:32.872076    1       < 0.1%
2017-07-22 16:35:28.316834    1       < 0.1%
2017-07-22 16:07:42.762415    1       < 0.1%
2017-07-22 16:36:48.299793    1       < 0.1%
2017-07-23 03:14:27.692034    1       < 0.1%
2017-07-22 17:28:36.876880    1       < 0.1%
2017-07-22 22:31:47.200359    1       < 0.1%
Other values (18713)          18713   99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2017-07-2213694
36.6%
2017-07-235029
 
13.4%
16:53:58.7887201
 
< 0.1%
16:26:25.9845061
 
< 0.1%
06:00:35.0584211
 
< 0.1%
22:58:38.7115191
 
< 0.1%
16:07:14.4090251
 
< 0.1%
22:31:51.4035921
 
< 0.1%
18:22:52.9482791
 
< 0.1%
05:53:53.0169821
 
< 0.1%
Other values (18715)18715
50.0%

Most occurring characters

ValueCountFrequency (%)
281800
16.8%
065421
13.4%
755572
11.4%
148766
10.0%
-37446
7.7%
:37446
7.7%
329626
 
6.1%
522515
 
4.6%
419641
 
4.0%
619288
 
4.0%
Other values (4)69277
14.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number374460
76.9%
Other Punctuation56169
 
11.5%
Dash Punctuation37446
 
7.7%
Space Separator18723
 
3.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
281800
21.8%
065421
17.5%
755572
14.8%
148766
13.0%
329626
 
7.9%
522515
 
6.0%
419641
 
5.2%
619288
 
5.2%
816269
 
4.3%
915562
 
4.2%
Other Punctuation
ValueCountFrequency (%)
:37446
66.7%
.18723
33.3%
Dash Punctuation
ValueCountFrequency (%)
-37446
100.0%
Space Separator
ValueCountFrequency (%)
18723
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common486798
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
281800
16.8%
065421
13.4%
755572
11.4%
148766
10.0%
-37446
7.7%
:37446
7.7%
329626
 
6.1%
522515
 
4.6%
419641
 
4.0%
619288
 
4.0%
Other values (4)69277
14.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII486798
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
281800
16.8%
065421
13.4%
755572
11.4%
148766
10.0%
-37446
7.7%
:37446
7.7%
329626
 
6.1%
522515
 
4.6%
419641
 
4.0%
619288
 
4.0%
Other values (4)69277
14.2%

latitude
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 15595
Distinct (%): 83.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.36526062
Minimum: 52.2962
Maximum: 52.42498
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 146.3 KiB

Quantile statistics

Minimum: 52.2962
5-th percentile: 52.3432883
Q1: 52.3552535
Median: 52.364628
Q3: 52.3747975
95-th percentile: 52.3893716
Maximum: 52.42498
Range: 0.12878
Interquartile range (IQR): 0.019544

Descriptive statistics

Standard deviation: 0.01514204237
Coefficient of variation (CV): 0.0002891619786
Kurtosis: 1.418259861
Mean: 52.36526062
Median Absolute Deviation (MAD): 0.009735
Skewness: 0.007905417451
Sum: 980434.7747
Variance: 0.0002292814473
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
52.361364             5       < 0.1%
52.354646             5       < 0.1%
52.366852             5       < 0.1%
52.360546             5       < 0.1%
52.362453             4       < 0.1%
52.361144             4       < 0.1%
52.361118             4       < 0.1%
52.367149             4       < 0.1%
52.373259             4       < 0.1%
52.369917             4       < 0.1%
Other values (15585)  18679   99.8%

Value                 Count   Frequency (%)
52.2962               1       < 0.1%
52.297203             1       < 0.1%
52.299763             1       < 0.1%
52.299846             1       < 0.1%
52.299875             1       < 0.1%
52.300105             1       < 0.1%
52.30013              1       < 0.1%
52.300915             1       < 0.1%
52.301257             1       < 0.1%
52.301683             1       < 0.1%

Value                 Count   Frequency (%)
52.42498              1       < 0.1%
52.424641             1       < 0.1%
52.424255             1       < 0.1%
52.423647             1       < 0.1%
52.423498             1       < 0.1%
52.423432             1       < 0.1%
52.423321             1       < 0.1%
52.422827             1       < 0.1%
52.422232             1       < 0.1%
52.422228             1       < 0.1%

longitude
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 17157
Distinct (%): 91.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.888585181
Minimum: 4.763264
Maximum: 5.027689
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 146.3 KiB

Quantile statistics

Minimum: 4.763264
5-th percentile: 4.8453101
Q1: 4.8643445
Median: 4.885994
Q3: 4.90748
95-th percentile: 4.9445407
Maximum: 5.027689
Range: 0.264425
Interquartile range (IQR): 0.0431355

Descriptive statistics

Standard deviation: 0.03453688189
Coefficient of variation (CV): 0.007064801084
Kurtosis: 1.217000077
Mean: 4.888585181
Median Absolute Deviation (MAD): 0.021585
Skewness: 0.5382471118
Sum: 91528.98035
Variance: 0.001192796211
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
4.907187 | 5 | < 0.1%
4.893506 | 4 | < 0.1%
4.888738 | 4 | < 0.1%
4.893017 | 4 | < 0.1%
4.875611 | 4 | < 0.1%
4.891295 | 4 | < 0.1%
4.861512 | 4 | < 0.1%
4.86301 | 4 | < 0.1%
4.877004 | 4 | < 0.1%
4.856525 | 4 | < 0.1%
Other values (17147) | 18682 | 99.8%

Minimum values

Value | Count | Frequency (%)
4.763264 | 1 | < 0.1%
4.768452 | 1 | < 0.1%
4.769151 | 1 | < 0.1%
4.771083 | 1 | < 0.1%
4.772725 | 1 | < 0.1%
4.772822 | 1 | < 0.1%
4.775168 | 1 | < 0.1%
4.775748 | 1 | < 0.1%
4.77647 | 1 | < 0.1%
4.77764 | 1 | < 0.1%

Maximum values

Value | Count | Frequency (%)
5.027689 | 1 | < 0.1%
5.026701 | 1 | < 0.1%
5.015737 | 1 | < 0.1%
5.013557 | 1 | < 0.1%
5.013316 | 1 | < 0.1%
5.013075 | 1 | < 0.1%
5.012549 | 1 | < 0.1%
5.011693 | 1 | < 0.1%
5.011688 | 1 | < 0.1%
5.011569 | 1 | < 0.1%

location
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 18723
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 73.2 KiB

Value | Count
0101000020E610000005F9D9C87573134070404B57B02F4A40 | 1
0101000020E6100000EC4D0CC9C964134032C7F2AE7A304A40 | 1
0101000020E6100000E63BF88903681340F20A444FCA304A40 | 1
0101000020E6100000B9FC87F4DBD713406ADE718A8E304A40 | 1
0101000020E6100000ADFA5C6DC57E134007616EF7722F4A40 | 1
Other values (18718) | 18718

Length

Max length: 50
Median length: 50
Mean length: 50
Min length: 50

Characters and Unicode

Total characters: 936150
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18723
Unique (%): 100.0%

Sample

1st row: 0101000020E610000033FAD170CA8C13403BC5AA41982D4A40
2nd row: 0101000020E6100000842A357BA095134042791F4773304A40
3rd row: 0101000020E6100000A51133FB3CC613403543AA285E2B4A40
4th row: 0101000020E6100000DF180280638F134085EE92382B304A40
5th row: 0101000020E6100000CD902A8A57691340187B2FBE682F4A40
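
The report does not say how `location` is encoded. The 50-character hex strings look like PostGIS-style EWKB points (byte-order flag, geometry type with an SRID flag, SRID, then two doubles), but that reading is an assumption. Under it, a minimal sketch that decodes the first sample row with only the standard library might look like this; the function name `decode_ewkb_point` is hypothetical:

```python
import struct

def decode_ewkb_point(hex_string):
    """Decode a 25-byte little-endian EWKB point carrying an SRID into (srid, x, y)."""
    raw = bytes.fromhex(hex_string)
    # Layout assumed here: 1-byte order flag, uint32 type, uint32 SRID, two float64 coordinates.
    byte_order, geom_type, srid, x, y = struct.unpack("<BIIdd", raw)
    if byte_order != 1:
        raise ValueError("only little-endian input is handled in this sketch")
    # 0x20000000 marks "SRID present"; the low bits give the geometry type (1 = Point).
    if geom_type != 0x20000001:
        raise ValueError("expected a 2D point with an SRID")
    return srid, x, y

srid, lon, lat = decode_ewkb_point("0101000020E610000033FAD170CA8C13403BC5AA41982D4A40")
print(srid, round(lon, 6), round(lat, 6))
```

If the assumption holds, the decoded x and y should match the `longitude` and `latitude` columns of the same row, which would make `location` redundant with those two columns.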

Common Values

Value | Count | Frequency (%)
0101000020E610000005F9D9C87573134070404B57B02F4A40 | 1 | < 0.1%
0101000020E6100000EC4D0CC9C964134032C7F2AE7A304A40 | 1 | < 0.1%
0101000020E6100000E63BF88903681340F20A444FCA304A40 | 1 | < 0.1%
0101000020E6100000B9FC87F4DBD713406ADE718A8E304A40 | 1 | < 0.1%
0101000020E6100000ADFA5C6DC57E134007616EF7722F4A40 | 1 | < 0.1%
0101000020E6100000CD751A69A99C1340CC441152B7354A40 | 1 | < 0.1%
0101000020E6100000C6C03A8E1F6A13404833164D672D4A40 | 1 | < 0.1%
0101000020E61000007F83F6EAE361134021AF0793E22F4A40 | 1 | < 0.1%
0101000020E6100000DC291DACFFA313406D74CE4F712E4A40 | 1 | < 0.1%
0101000020E6100000EA79371614A6134052F2EA1C032E4A40 | 1 | < 0.1%
Other values (18713) | 18713 | 99.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
0101000020e610000022c5008926801340f418e599972f4a40 | 1 | < 0.1%
0101000020e610000018213cda38b213408c31b08ee32f4a40 | 1 | < 0.1%
0101000020e61000004c8a8f4fc86e1340d5963ac8eb2f4a40 | 1 | < 0.1%
0101000020e61000004260e5d0227b13408655bc91792e4a40 | 1 | < 0.1%
0101000020e61000006fd74b53049813406daaee91cd2d4a40 | 1 | < 0.1%
0101000020e6100000f01307d0ef6b13402e3bc43f6c2f4a40 | 1 | < 0.1%
0101000020e61000004e9a0645f3c013405589b2b7942f4a40 | 1 | < 0.1%
0101000020e61000002575029a087b1340c362d4b5f62e4a40 | 1 | < 0.1%
0101000020e6100000520dfb3db1fe1340855e7f129f2d4a40 | 1 | < 0.1%
0101000020e61000003d1059a489871340639cbf0985304a40 | 1 | < 0.1%
Other values (18713) | 18713 | 99.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 289829 | 31.0%
1 | 100408 | 10.7%
4 | 81235 | 8.7%
2 | 58034 | 6.2%
3 | 48096 | 5.1%
E | 47519 | 5.1%
6 | 46156 | 4.9%
A | 45108 | 4.8%
D | 28544 | 3.0%
8 | 28194 | 3.0%
Other values (6) | 163027 | 17.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 732898 | 78.3%
Uppercase Letter | 203252 | 21.7%

Most frequent character per category

Decimal Number

Value | Count | Frequency (%)
0 | 289829 | 39.5%
1 | 100408 | 13.7%
4 | 81235 | 11.1%
2 | 58034 | 7.9%
3 | 48096 | 6.6%
6 | 46156 | 6.3%
8 | 28194 | 3.8%
7 | 28082 | 3.8%
9 | 27873 | 3.8%
5 | 24991 | 3.4%

Uppercase Letter

Value | Count | Frequency (%)
E | 47519 | 23.4%
A | 45108 | 22.2%
D | 28544 | 14.0%
F | 28123 | 13.8%
C | 27427 | 13.5%
B | 26531 | 13.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 732898 | 78.3%
Latin | 203252 | 21.7%

Most frequent character per script

Common

Value | Count | Frequency (%)
0 | 289829 | 39.5%
1 | 100408 | 13.7%
4 | 81235 | 11.1%
2 | 58034 | 7.9%
3 | 48096 | 6.6%
6 | 46156 | 6.3%
8 | 28194 | 3.8%
7 | 28082 | 3.8%
9 | 27873 | 3.8%
5 | 24991 | 3.4%

Latin

Value | Count | Frequency (%)
E | 47519 | 23.4%
A | 45108 | 22.2%
D | 28544 | 14.0%
F | 28123 | 13.8%
C | 27427 | 13.5%
B | 26531 | 13.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 936150 | 100.0%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
0 | 289829 | 31.0%
1 | 100408 | 10.7%
4 | 81235 | 8.7%
2 | 58034 | 6.2%
3 | 48096 | 5.1%
E | 47519 | 5.1%
6 | 46156 | 4.9%
A | 45108 | 4.8%
D | 28544 | 3.0%
8 | 28194 | 3.0%
Other values (6) | 163027 | 17.4%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
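
As a minimal sketch of that calculation (the function name `pearson_r` and the sample data are illustrative, not taken from the report), the following computes r from the covariance and standard deviations and checks the invariance to location and scale:

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]

r = pearson_r(xs, ys)
# Shifting and rescaling xs (with a positive factor) should leave r unchanged.
r_shifted = pearson_r([3 * x + 10 for x in xs], ys)
print(r, r_shifted)
```

Population (divide by n) estimates are used for both the covariance and the standard deviations; using n - 1 throughout gives the same r because the factors cancel.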

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
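
A minimal sketch of that recipe, assuming ties are given the average of the ranks they span (a common convention; the report does not state which one it uses). The helper names and sample data are illustrative:

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but not linear
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this monotonic but nonlinear data, ρ reaches 1 while Pearson's r stays below 1.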

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
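
The formula as stated corresponds to the variant usually called tau-a, which makes no adjustment for ties. A minimal sketch of it follows; the function name `kendall_tau_a` is illustrative, and library implementations often compute the tie-adjusted tau-b instead, so their results can differ when ties are present:

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Six pairs in total: five concordant, one discordant (the swapped 3 and 2).
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```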

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is linked from the original report.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses a bias-corrected measure proposed by Bergsma in 2013.
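
As a minimal sketch of a bias-corrected Cramér's V in the form commonly attributed to Bergsma (2013): φ² = χ²/n is reduced by (k-1)(r-1)/(n-1) and clipped at 0, and the row and column counts are shrunk analogously. The function name and the contingency tables are illustrative, and this is not necessarily the exact code path the report generator uses:

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)

print(cramers_v_corrected([[10, 0], [0, 10]]))  # perfectly associated table
print(cramers_v_corrected([[5, 5], [5, 5]]))    # independent table
```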

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The dendrogram lets you more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

index | room_id | survey_id | host_id | room_type | country | city | borough | neighborhood | reviews | overall_satisfaction | accommodates | bedrooms | bathrooms | price | minstay | name | last_modified | latitude | longitude | location
0 | 10176931 | 1476 | 49180562 | Shared room | NaN | Amsterdam | NaN | De Pijp / Rivierenbuurt | 7 | 4.5 | 2 | 1.0 | NaN | 156.0 | NaN | Red Light/ Canal view apartment (Shared) | 2017-07-23 13:06:27.391699 | 52.356209 | 4.887491 | 0101000020E610000033FAD170CA8C13403BC5AA41982D4A40
1 | 8935871 | 1476 | 46718394 | Shared room | NaN | Amsterdam | NaN | Centrum West | 45 | 4.5 | 4 | 1.0 | NaN | 126.0 | NaN | Sunny and Cozy Living room in quite neighbours | 2017-07-23 13:06:23.607187 | 52.378518 | 4.896120 | 0101000020E6100000842A357BA095134042791F4773304A40
2 | 14011697 | 1476 | 10346595 | Shared room | NaN | Amsterdam | NaN | Watergraafsmeer | 1 | 0.0 | 3 | 1.0 | NaN | 132.0 | NaN | Amsterdam | 2017-07-23 13:06:23.603546 | 52.338811 | 4.943592 | 0101000020E6100000A51133FB3CC613403543AA285E2B4A40
3 | 6137978 | 1476 | 8685430 | Shared room | NaN | Amsterdam | NaN | Centrum West | 7 | 5.0 | 4 | 1.0 | NaN | 121.0 | NaN | Canal boat RIDE in Amsterdam | 2017-07-23 13:06:22.689787 | 52.376319 | 4.890028 | 0101000020E6100000DF180280638F134085EE92382B304A40
4 | 18630616 | 1476 | 70191803 | Shared room | NaN | Amsterdam | NaN | De Baarsjes / Oud West | 1 | 0.0 | 2 | 1.0 | NaN | 93.0 | NaN | One room for rent in a three room appartment | 2017-07-23 13:06:19.681469 | 52.370384 | 4.852873 | 0101000020E6100000CD902A8A57691340187B2FBE682F4A40
5 | 5790170 | 1476 | 29968916 | Shared room | NaN | Amsterdam | NaN | De Pijp / Rivierenbuurt | 184 | 4.5 | 2 | 1.0 | NaN | 102.0 | NaN | Beautiful apartment | 2017-07-23 13:06:19.663975 | 52.342265 | 4.897126 | 0101000020E6100000B090B932A896134060C8EA56CF2B4A40
6 | 934060 | 1476 | 5037506 | Shared room | NaN | Amsterdam | NaN | Oostelijk Havengebied / Indische Buurt | 67 | 5.0 | 16 | 1.0 | NaN | 462.0 | NaN | LOTUS, Classic Dutch Saling Barge | 2017-07-23 13:06:09.988016 | 52.377552 | 4.930418 | 0101000020E61000005D70067FBFB813400B45BA9F53304A40
7 | 19590049 | 1476 | 132687356 | Shared room | NaN | Amsterdam | NaN | Westerpark | 2 | 0.0 | 2 | 1.0 | NaN | 414.0 | NaN | big boot Adam 04 | 2017-07-23 13:06:09.984748 | 52.375205 | 4.866117 | 0101000020E6100000DD09F65FE7761340D925AAB706304A40
8 | 5020280 | 1476 | 4059485 | Shared room | NaN | Amsterdam | NaN | Oud Oost | 2 | 0.0 | 2 | 1.0 | NaN | 222.0 | NaN | Bright modern appartment in East! | 2017-07-23 13:06:07.452609 | 52.357346 | 4.912887 | 0101000020E610000032C687D9CBA613409FAD8383BD2D4A40
9 | 15810783 | 1476 | 84978218 | Shared room | NaN | Amsterdam | NaN | Centrum West | 0 | 0.0 | 12 | 1.0 | NaN | 301.0 | NaN | CANAL BOATTOUR AMSTERDAM covered boat 1,5 hour | 2017-07-23 13:06:07.447989 | 52.386610 | 4.890128 | 0101000020E6100000FB03E5B67D8F13403D27BD6F7C314A40

Last rows

index | room_id | survey_id | host_id | room_type | country | city | borough | neighborhood | reviews | overall_satisfaction | accommodates | bedrooms | bathrooms | price | minstay | name | last_modified | latitude | longitude | location
18713 | 2763386 | 1476 | 14122005 | Private room | NaN | Amsterdam | NaN | Slotervaart | 118 | 5.0 | 2 | 1.0 | NaN | 36.0 | NaN | Comfortable SKY ROOM 12th floor | 2017-07-22 16:05:14.173175 | 52.361043 | 4.846134 | 0101000020E6100000792288F37062134091B932A8362E4A40
18714 | 19203256 | 1476 | 132265798 | Private room | NaN | Amsterdam | NaN | Bijlmer Centrum | 1 | 0.0 | 4 | 1.0 | NaN | 35.0 | NaN | NEW Stylish room, Ziggodome, AFAS LIVE, ArenA, RAI | 2017-07-22 16:05:14.168799 | 52.320049 | 4.955609 | 0101000020E6100000950D6B2A8BD213400A0F9A5DF7284A40
18715 | 19734178 | 1476 | 139135665 | Private room | NaN | Amsterdam | NaN | Osdorp | 0 | 0.0 | 1 | 0.0 | NaN | 30.0 | NaN | Cozy Apartment in Nieuw-West | 2017-07-22 16:05:14.166410 | 52.356702 | 4.792346 | 0101000020E61000003677F4BF5C2B13407A354069A82D4A40
18716 | 288967 | 1476 | 1501422 | Private room | NaN | Amsterdam | NaN | De Baarsjes / Oud West | 281 | 5.0 | 3 | 1.0 | NaN | 36.0 | NaN | BandB de Baarsjes Amsterdam | 2017-07-22 16:05:14.163973 | 52.361918 | 4.855507 | 0101000020E61000000DFFE9060A6C1340B8EA3A54532E4A40
18717 | 16685383 | 1476 | 5831960 | Private room | NaN | Amsterdam | NaN | Bos en Lommer | 5 | 5.0 | 2 | 1.0 | NaN | 30.0 | NaN | A nice bed in the attic of my 'palace'. | 2017-07-22 16:05:14.161714 | 52.379638 | 4.848829 | 0101000020E6100000E695EB6D33651340D0285DFA97304A40
18718 | 17789893 | 1476 | 47501089 | Private room | NaN | Amsterdam | NaN | Bijlmer Centrum | 10 | 5.0 | 3 | 1.0 | NaN | 32.0 | NaN | 1-3 pers. Cozy Rm AFAS Live, ArenA, ZIGGODOME | 2017-07-22 16:05:14.158963 | 52.319794 | 4.955638 | 0101000020E6100000684293C492D2134080BA8102EF284A40
18719 | 16877166 | 1476 | 67093870 | Private room | NaN | Amsterdam | NaN | Bijlmer Centrum | 6 | 5.0 | 4 | 1.0 | NaN | 24.0 | NaN | Modern Room by Arena, ZIGGO, HmH | 2017-07-22 16:05:14.151986 | 52.319080 | 4.954822 | 0101000020E61000005801BEDBBCD1134062670A9DD7284A40
18720 | 19859427 | 1476 | 29724632 | Private room | NaN | Amsterdam | NaN | Geuzenveld / Slotermeer | 0 | 0.0 | 1 | 1.0 | NaN | 38.0 | NaN | Private single room | 2017-07-22 16:05:14.149610 | 52.384028 | 4.838403 | 0101000020E61000002079E750865A1340C85F5AD427314A40
18721 | 17132164 | 1476 | 115156569 | Private room | NaN | Amsterdam | NaN | Centrum West | 13 | 4.5 | 2 | 1.0 | NaN | 36.0 | NaN | City Center studio in Touristic Amsterdam 1 | 2017-07-22 16:05:14.146183 | 52.372120 | 4.890982 | 0101000020E6100000774CDD955D9013400118CFA0A12F4A40
18722 | 7605782 | 1476 | 39503013 | Private room | NaN | Amsterdam | NaN | Centrum West | 113 | 4.5 | 2 | 1.0 | NaN | 35.0 | NaN | I have a room available for rent | 2017-07-22 16:05:12.257054 | 52.381392 | 4.899658 | 0101000020E6100000CD565EF23F9913405F7AFB73D1304A40